Generative and Discriminative Methods for Online Adaptation in SMT
Abstract
In an online learning protocol, immediate feedback about each example is used to refine the next prediction. We apply this protocol to statistical machine translation for computer-assisted translation and compare generative and discriminative approaches for online adaptation. We develop our methods on reference translations and test on feedback gathered from professional translators. Experimental results show that improvements of straightforward adaptations of translation and language model are greater than those achieved by discriminative re-ranking. However, the improvements add up to 4 BLEU points over a baseline static model.
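The online protocol described in the abstract (predict, receive feedback, update, then move to the next example) can be sketched as a short loop. This is only an illustrative sketch, not the paper's system: `AdaptiveTranslator`, `run_online`, and the exact-match memory are hypothetical stand-ins for the translation-model and language-model adaptation the abstract mentions, and the baseline is any function from a source string to a target string.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class AdaptiveTranslator:
    """Toy stand-in for an adaptive SMT system (hypothetical, for illustration)."""

    def __init__(self, baseline: Callable[[str], str]) -> None:
        self.baseline = baseline          # static model, never updated
        self.memory: Dict[str, str] = {}  # feedback collected so far

    def predict(self, source: str) -> str:
        # Prefer a translation learned from earlier feedback, else fall back
        # to the static baseline.
        return self.memory.get(source, self.baseline(source))

    def update(self, source: str, feedback: str) -> None:
        # Incorporate the corrected translation immediately.
        self.memory[source] = feedback


def run_online(model: AdaptiveTranslator,
               stream: Iterable[Tuple[str, str]]) -> List[str]:
    """Predict each example first, then learn from its feedback."""
    outputs: List[str] = []
    for source, feedback in stream:
        outputs.append(model.predict(source))  # feedback is not yet visible
        model.update(source, feedback)         # refine before the next example
    return outputs
```

For example, with `str.upper` as a dummy baseline, a sentence that repeats in the stream is handled by the baseline the first time and by the stored feedback the second time. A static model would return the baseline output both times, which is the comparison point the abstract refers to.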
Related Resources
Discriminative Models and Training Methods For Statistical Machine Translation
Statistical Machine Translation (SMT) has been the dominant flavor of Machine Translation (MT) over the last decade. Traditional SMT systems have a pipeline structure in which different kinds of Machine Learning models are employed in different stages. For the translation modeling, most state-of-the-art systems use hybrid models that combine a handful of generative models in a discriminative framew...
(Hidden) Conditional Random Fields Using Intermediate Classes for Statistical Machine Translation
One of the major components of Statistical Machine Translation (SMT) are generative translation models. As in other fields, where the transition from generative to discriminative training resulted in higher performance, it seems likely that translation models should be trained in a discriminative way. But due to the nature of SMT with large vocabularies, hidden alignments, reordering, and large...
A Feature-rich Supervised Word Alignment Model for Phrase-based Statistical Machine Translation
Word alignment plays an important role in statistical machine translation (SMT) systems. The output of word alignment can be used to build a phrase table, which is the core model in the decoding of new sentences. Most current SMT systems use GIZA++, a generative model, to automatically align words from sentence-aligned parallel corpora. GIZA++ works well when large sentence-aligned corpora are ...
Principled Hybrids of Generative and Discriminative Domain Adaptation
We propose a probabilistic framework for domain adaptation that blends both generative and discriminative modeling in a principled way. Under this framework, generative and discriminative models correspond to specific choices of the prior over parameters. This provides us a very general way to interpolate between generative and discriminative extremes through different choices of priors. By max...
Unsupervised, Efficient and Semantic Expertise Retrieval
We introduce an unsupervised discriminative model for the task of retrieving experts in online document collections. We exclusively employ textual evidence and avoid explicit feature engineering by learning distributed word representations in an unsupervised way. We compare our model to state-of-the-art unsupervised statistical vector space and probabilistic generative approaches. Our proposed ...